SDA 3.5 Documentation for CASEStoDDL


NAME

CASEStoDDL - Create a DDL file from CASES (version 4)

DESCRIPTION

SDA programs can be used to document data collected by the CASES system for computer-assisted interviewing. To use the SDA programs, you first need a DDL file that describes the specific data file produced by the CASES system. This document summarizes the procedures necessary to create such a DDL file.

There are many options available to customize the content of a DDL file produced from a CASES instrument. However, there are default options which will usually produce satisfactory results, at least as a starting point. The purpose of this document is to illustrate how to run the procedures in a simple way, taking advantage of the default options.

STEPS OF THE PROCESS

  1. Create a list of variables to output
  2. Create a list of cases to include in the data file
  3. Create the data file by running the CASES ‘output’ program
  4. Run the SDA ‘q4toddl’ program
  5. Add study-level information to the DDL file
  6. Check the DDL file

1. CREATE A LIST OF VARIABLES TO OUTPUT

Create a list of variables that you want to include in your data file. The list of variables should have one variable name on each line. Note that there are usually many variables used by the interviewing system that you will not want to pass through to the data file.

One way to obtain such a list is to run the CASES ‘layout’ program with the ‘-b’ (brief) and ‘-o’ (only variables) options.

The list produced by ‘layout’ can then be edited down to the variables of substantive interest by deleting fills and various other non-input items.

Save the final list of variables in a file named something like ‘myvars’, for input into the third stage of the process.
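
The editing-down step can be partly scripted with standard Unix tools. The sketch below is only an illustration, not part of CASES or SDA. It assumes the brief list from ‘layout’ has already been saved, one name per line, in a file called ‘allvars’, and that the names to discard are kept in a file called ‘dropvars’. Both of those file names and all of the sample variable names are hypothetical.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are overwritten.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Hypothetical sample input: in practice 'allvars' would hold the list
# saved from the CASES 'layout' program, one variable name per line.
printf '%s\n' caseid age fill1 sex fill2 income > allvars

# Hypothetical list of names to discard (fills and other non-input items).
printf '%s\n' fill1 fill2 > dropvars

# Keep every line of allvars that does not exactly match a line of dropvars.
#   -v invert the match   -x match whole lines only
#   -F fixed strings      -f read the patterns from a file
grep -v -x -F -f dropvars allvars > myvars

# Sanity check: report any line that is empty or holds more than one name.
awk 'NF != 1 { printf "line %d is not a single name: %s\n", NR, $0; bad = 1 }
     END { exit bad }' myvars
```

Matching whole lines as fixed strings keeps a name such as ‘fill1’ from also removing a longer name that merely contains it.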


2. CREATE A LIST OF CASES TO INCLUDE IN THE DATA FILE

You will usually want to include only the completed cases in the data file. In order to do this, you must prepare a list of the cases to be output by the CASES system.

The CASES ‘caselist’ program produces lists of cases, according to criteria that you specify. The precise criteria to use depend on your treatment of completed cases -- whether they have been run through the second-stage cleaning process, for example. Typically, you would specify that the ‘caselist’ program produce a list of all cases that are in one of the following stages: in ‘middle’ or in ‘ready’ or in ‘certified’.

Save that list of cases in a file named something like ‘idlist’, for input into the next stage of the process.
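
Before passing the list on, it can be worth a quick check that it is not empty and contains no repeated IDs. The following is a minimal sketch using standard Unix tools only. It assumes that ‘idlist’ holds one case ID per line, which this document does not confirm, and the sample IDs are hypothetical.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are overwritten.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Hypothetical sample: in practice 'idlist' is the list saved from 'caselist'.
# Assumption: one case ID per line.
printf '%s\n' 1001 1002 1003 1002 > idlist

# Count the non-empty lines, i.e. the number of cases listed.
total=$(grep -c . idlist)

# Collect any ID that appears more than once.
dups=$(sort idlist | uniq -d)

echo "cases listed: $total"
if [ -n "$dups" ]; then
    echo "duplicate IDs:"
    echo "$dups"
fi
```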


3. CREATE THE DATA FILE BY RUNNING THE CASES OUTPUT PROGRAM

To create the data file that will be documented, run the CASES ‘output’ program, using the ‘-i = filename’ option. (Do NOT run the CASES ‘output’ program without the ‘-i’ option. See below for an explanation of why not.) The ‘-i’ option is used to specify the name of a file containing a list of the variables that you want to include in the data file. This is the file you produced in step #1 above.

For example, if the file ‘myvars’ contains a list of variables for CASES to output, and if the file ‘idlist’ contains a list of the case IDs to be output, you could use the following command:

output -i=myvars -ou=mydata idlist

In this example, the ‘output’ program would generate two files: the data file and a layout file giving the location of each selected variable in that data file.

If you do NOT use the ‘-i’ option, the ‘output’ program will produce a large data file with many variables you probably do not want to include in a dataset for analysis. Also, you will not get a layout file for the variables you want -- rather, you will have to rely on the comprehensive layout produced by the CASES ‘layout’ program. That layout refers to variables from the ZERO record as being located in record 0, which will cause problems if you try to pass those locations on to other programs.
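
Because running ‘output’ without the ‘-i’ option gives unwanted results, one possible safeguard is a small wrapper that refuses to build the command unless both input lists exist and are non-empty. The sketch below is an illustration under stated assumptions: the function name ‘build_output_cmd’ and the dry-run behaviour are hypothetical additions, and the only CASES command line used is the one from the example above.

```shell
#!/bin/sh
# Build the 'output' command line only if both input lists are usable.
# build_output_cmd is a hypothetical helper, not part of CASES or SDA.
build_output_cmd() {
    var_file=$1    # list of variables from step 1
    data_name=$2   # name passed to the -ou option
    id_file=$3     # list of case IDs from step 2

    for f in "$var_file" "$id_file"; do
        # -s is true only for a file that exists and is not empty.
        if [ ! -s "$f" ]; then
            echo "missing or empty file: $f" >&2
            return 1
        fi
    done
    echo "output -i=$var_file -ou=$data_name $id_file"
}

# Dry run in a scratch directory with hypothetical sample lists:
# the command is printed, not executed.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
printf '%s\n' caseid age > myvars
printf '%s\n' 1001 1002 > idlist
cmd=$(build_output_cmd myvars mydata idlist) && echo "$cmd"
```

Printing the command first lets you review it before running it by hand in the directory where the CASES programs are available.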

4. RUN THE SDA ‘Q4TODDL’ PROGRAM TO MAKE A DDL FILE

The Q4TODDL program gathers information both from the CASES instrument and from the layout file, and then it puts the pieces together in the form of a DDL file. The text of questions and the category labels are taken from the CASES instrument. The location of each variable in the data file is taken from the layout file.

The process includes the following steps:


5. ADD STUDY-LEVEL INFORMATION TO THE DDL FILE

Information about the dataset as a whole is not contained either in the CASES Q-language file or in the layout file. You can edit that information manually into the DDL file.

The main required elements are:

For a complete description of the required format of a DDL file, see the DDL document.


6. CHECK THE DDL FILE

After you have run Q4TODDL, added the required study-level information, and made any other changes you want, you can check the resulting DDL file for syntax errors. The MAKESDA program will do this for you, if you use the ‘-c’ option.

For a DDL file named ‘myddl.txt’, you would give the following command:

makesda -c -l myddl.txt

Some messages will appear on the screen. A fuller report will be appended to the file ‘MAKESDA.MSG’. Also, note that a list of all variables processed will be put into the file ‘MAKESDA.LST’.

Once you have a DDL file without errors, you can proceed to create an SDA dataset and generate a codebook.
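
Since each report is appended to ‘MAKESDA.MSG’, the file grows over repeated checks, and the newest report can be hard to find. One way to see only the latest one is to note the length of the file before the run and print whatever was added afterwards. The sketch below assumes the report file is in the current directory. The helper ‘show_new_lines’ is hypothetical, and ‘fake_check’ is a stand-in used here only so the example runs without SDA installed.

```shell
#!/bin/sh
# Run a command, then print only the lines it appended to a report file.
# show_new_lines is a hypothetical helper, not part of SDA.
show_new_lines() {
    report=$1
    shift
    before=0
    [ -f "$report" ] && before=$(wc -l < "$report")
    "$@"                      # run the command that writes the report
    status=$?
    # Print everything after the lines that were already there.
    [ -f "$report" ] && tail -n +"$((before + 1))" "$report"
    return "$status"
}

# Stand-in for makesda so the sketch is runnable anywhere.
fake_check() { echo "report for $1" >> MAKESDA.MSG; }

workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
echo "older report" > MAKESDA.MSG
new=$(show_new_lines MAKESDA.MSG fake_check myddl.txt)
echo "$new"

# Real use, assuming makesda is on the PATH:
#   show_new_lines MAKESDA.MSG makesda -c -l myddl.txt
```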


SEE ALSO:

  DDL       Data Description Language
  makesda   Make an SDA dataset from a DDL file and a data file
  q4toddl   Convert CASES Q-language files to DDL
  xcodebk   Produce a codebook


CSM, UC Berkeley
April 12, 2011